Skip to content

feat(resource): add alpha clickhouse_postgres_service resource#553

Draft
amogiska wants to merge 15 commits into
amogiska/managed-postgres-api-clientfrom
amogiska/managed-postgres-resource
Draft

feat(resource): add alpha clickhouse_postgres_service resource#553
amogiska wants to merge 15 commits into
amogiska/managed-postgres-api-clientfrom
amogiska/managed-postgres-resource

Conversation

@amogiska
Copy link
Copy Markdown

Summary

  • Builds on top of #544 (Phase 1 — Managed Postgres HTTP client). Merge feat(api): add Managed Postgres HTTP client #544 first. This PR's base ref is the Phase 1 branch so reviewers see only Phase 2's diff.
  • Adds the alpha-tagged clickhouse_postgres_service Terraform resource — Phase 2 of the project plan. Schema, lifecycle (Create with mid-Create partial-state write, Read with chc_ system-tag filter, Update with sparse PATCH + conditional WaitForPostgresLeaveAndReturn, Delete, ImportState), validators (cloud_provider / postgres_version / ha_type / size OneOf, name length 1–50, chc_-prefix tag rejection, tag-value empty-string rejection), and embedded MarkdownDescription with the "differences from clickhouse_service" section.
  • Tags use SetNestedAttribute of {key, value} objects so the round-trip preserves the server's ResourceTagV1 shape (an array of objects with optional value, not a flat map). PostgresUpdate.Tags is the Phase 1 *[]Tag pointer-to-slice — clearing all tags PATCH-sends "tags": [] (regression-tested by TestBuildPostgresUpdate subtest tags cleared).
  • Two documented deviations from the plan (full reasoning in phase-two-implementation.md):
    1. No timeouts {} block. Deferred to Phase 5 so the terraform-plugin-framework-timeouts dependency lands once for multiple beneficiaries (restore + read replica). Phase 2 ships hardcoded 30m/30m/10m budgets as package-level const in postgres_service_constants.go.
    2. Helper-level tests, not framework-lifecycle tests. Matches the codebase convention — no other resource in pkg/resource/ constructs synthetic tfsdk.Plan / tfsdk.State. The plan's _Create_HappyPath / _PartialStatePersists / _ImportState_HydratesFromServer move to Phase 6 alongside the dev-cluster integration tests.
  • A review-driven follow-up commit (8f9f47b) lands on top of the initial implementation:
    • Bug fix: reject value = "" on tags at plan time (otherwise the server's empty-string → no-value normalization causes perpetual plan/state drift).
    • Bug fix: drop booldefault.StaticBool(true) on is_primary (no semantics on a Computed-only attribute).
    • Dead code removed (port validator, dead Create password branch, redundant plan.Password = state.Password in Update).
    • strings.HasPrefix instead of hand-rolled slice compares.
    • Test path construction corrected; new TestTagValueValidator_RejectsEmptyString regression test.

Test plan

  • go build -tags alpha ./... clean
  • go build ./... (stable) clean — proves alpha gating works
  • go test -tags alpha ./... green — 26 new Postgres resource test cases pass, all pre-existing tests still pass
  • go test ./... (stable) green
  • go vet -tags alpha ./... clean
  • golangci-lint run ./... clean
  • make docs-alpha renders docs/resources/postgres_service.md (212 lines) — schema, alpha banner, "differences from clickhouse_service", and nested-tag schema all present. Per the alpha-docs precedent (release.yaml:418-425 + cleanup-alpha-examples job), the generated file is not committed; the embedded source lives at pkg/resource/descriptions/postgres_service.md.
  • Pre-commit hook (make fmt && make docs && make build) clean on both commits
  • Phase 6 integration tests against the dev cluster (deferred to that phase)

@amogiska amogiska force-pushed the amogiska/managed-postgres-api-client branch from c3c2cc9 to 16cf999 Compare May 28, 2026 18:24
amogiska added a commit that referenced this pull request May 28, 2026
Eight review concerns landed in one commit. Three were genuine blockers
that would have surfaced as real user pain post-merge; the rest improve
correctness / observability for the alpha.

High-severity (blockers)
- #1 state attribute UseStateForUnknown: without USFU the framework
  marked state as (known after apply) on every plan, forcing Update
  even on no-op applies — and the no-op branch wrote Unknown straight
  to final state, which the framework rejects with "Provider produced
  inconsistent result after apply." Every other persistent computed
  attribute already had USFU; this one was missed. Fix is the same
  one-liner used at clickpipe.go:179-185.
- #2 ha_type silent downgrade: Default("none") fires not just on
  Create but also when a user deletes the ha_type line on an existing
  resource — silently downgrading async/sync → none. A naive
  UseStateForUnknown won't help (Default beats USFU in plan), so the
  fix is to drop the Default entirely and add UseStateForUnknown.
  The server still defaults to "none" on Create when omitted, so
  fresh resources behave identically. An explicit ha_type = "none"
  still downgrades.
- #3 no-value tags break on PATCH: schema invited null values, but
  server returns 400 BAD_REQUEST on PATCH if any tag entry omits
  value (verified live in Phase 2 e2e v1 step 2). Create accepted
  them; an Update with a no-value tag in state would silently fail.
  Schema's value attribute is now Required to keep Create and Update
  behavior symmetric. Description updated.

Medium-severity
- #4 tag value mutation test: existing matrix never exercised
  changing the value on an existing key. Added the test plus a JSON
  wire-shape assertion for the tag-clear path ("tags": [] must reach
  the wire, not be omitted via marshalling).
- #5 size OneOf dropped: the 82-entry compile-time snapshot meant
  every new AWS instance family needed a provider patch release
  before users could adopt it. For the most-frequently-changed
  attribute the friction outweighed plan-time validation. Matches
  region's "server validates" pattern. The plan itself flagged this
  trade-off as worth dropping; the review made the case now to act.
- #6 context cancellation in waits: the wait helpers' outer
  backoff.Retry and doRequest's inner backoff both ignored ctx
  cancellation, so Ctrl-C / Terraform deadlines could spin up to
  the full retry budget. Wrapped both with backoff.WithContext.
  Regression test in pkg/internal/api/postgres_test.go asserts
  cancellation aborts within ~5s on a 5s-interval / 100-retry
  budget (would otherwise be 500s). Touches a Phase 1 file but
  the fix is strictly additive — all existing tests still pass.
- #7 log verbatim 409 body on dependent-replica fail-fast:
  errIndicatesDependentReplica is a speculative keyword heuristic
  (FIXME for Phase 6). Added a tflog.Warn capturing the verbatim
  error so Phase 6's dependent-replica integration test has real
  data to anchor on (or replace with a stable error code).

Nit
- Delete missing return after AddError: consistency with
  Create/Read/Update; framework already preserves state when
  diagnostics contain an error, but the missing return is a
  future-fall-through trap.

Documentation
- descriptions/postgres_service.md updated to reflect the schema
  changes: ha_type omission preserves prior state, value is now
  required on tags, size has no client-side enum, known-limitation
  text rewritten.

Verification
- go build -tags alpha ./... clean; go build ./... clean
- go test -tags alpha ./... green
- go test ./... (stable) green
- golangci-lint run ./... clean
amogiska added a commit that referenced this pull request May 28, 2026
Eight review concerns landed in one commit. Three were genuine blockers
that would have surfaced as real user pain post-merge; the rest improve
correctness / observability for the alpha.

High-severity (blockers)
- #1 state attribute UseStateForUnknown: without USFU the framework
  marked state as (known after apply) on every plan, forcing Update
  even on no-op applies — and the no-op branch wrote Unknown straight
  to final state, which the framework rejects with "Provider produced
  inconsistent result after apply." Every other persistent computed
  attribute already had USFU; this one was missed. Fix is the same
  one-liner used at clickpipe.go:179-185.
- #2 ha_type silent downgrade: Default("none") fires not just on
  Create but also when a user deletes the ha_type line on an existing
  resource — silently downgrading async/sync → none. A naive
  UseStateForUnknown won't help (Default beats USFU in plan), so the
  fix is to drop the Default entirely and add UseStateForUnknown.
  The server still defaults to "none" on Create when omitted, so
  fresh resources behave identically. An explicit ha_type = "none"
  still downgrades.
- #3 no-value tags break on PATCH: schema invited null values, but
  server returns 400 BAD_REQUEST on PATCH if any tag entry omits
  value (verified live in Phase 2 e2e v1 step 2). Create accepted
  them; an Update with a no-value tag in state would silently fail.
  Schema's value attribute is now Required to keep Create and Update
  behavior symmetric. Description updated.

Medium-severity
- #4 tag value mutation test: existing matrix never exercised
  changing the value on an existing key. Added the test plus a JSON
  wire-shape assertion for the tag-clear path ("tags": [] must reach
  the wire, not be omitted via marshalling).
- #5 size OneOf dropped: the 82-entry compile-time snapshot meant
  every new AWS instance family needed a provider patch release
  before users could adopt it. For the most-frequently-changed
  attribute the friction outweighed plan-time validation. Matches
  region's "server validates" pattern. The plan itself flagged this
  trade-off as worth dropping; the review made the case now to act.
- #6 context cancellation in waits: the wait helpers' outer
  backoff.Retry and doRequest's inner backoff both ignored ctx
  cancellation, so Ctrl-C / Terraform deadlines could spin up to
  the full retry budget. Wrapped both with backoff.WithContext.
  Regression test in pkg/internal/api/postgres_test.go asserts
  cancellation aborts within ~5s on a 5s-interval / 100-retry
  budget (would otherwise be 500s). Touches a Phase 1 file but
  the fix is strictly additive — all existing tests still pass.
- #7 log verbatim 409 body on dependent-replica fail-fast:
  errIndicatesDependentReplica is a speculative keyword heuristic
  (FIXME for Phase 6). Added a tflog.Warn capturing the verbatim
  error so Phase 6's dependent-replica integration test has real
  data to anchor on (or replace with a stable error code).

Nit
- Delete missing return after AddError: consistency with
  Create/Read/Update; framework already preserves state when
  diagnostics contain an error, but the missing return is a
  future-fall-through trap.

Documentation
- descriptions/postgres_service.md updated to reflect the schema
  changes: ha_type omission preserves prior state, value is now
  required on tags, size has no client-side enum, known-limitation
  text rewritten.

Verification
- go build -tags alpha ./... clean; go build ./... clean
- go test -tags alpha ./... green
- go test ./... (stable) green
- golangci-lint run ./... clean
@amogiska amogiska force-pushed the amogiska/managed-postgres-resource branch 2 times, most recently from 62fe692 to a11d326 Compare June 1, 2026 17:39
@amogiska amogiska force-pushed the amogiska/managed-postgres-api-client branch from dc6f705 to 9795ee1 Compare June 1, 2026 21:05
amogiska added 8 commits June 1, 2026 14:06
Implements Phase 2 of the Managed Postgres provider work: an alpha-tagged
clickhouse_postgres_service resource providing CRUD + Import on the
ClickHouse Cloud Managed Postgres API. Builds on the Phase 1 HTTP client.

Surface
- Schema attributes: id, name, cloud_provider, region, postgres_version,
  size, ha_type, tags (SetNestedAttribute of {key, value}), state,
  created_at, is_primary, hostname, port, username, connection_string
  (Sensitive), password (Sensitive, Computed-only in Phase 2).
- Lifecycle: Configure, Create (with explicit mid-Create partial-state
  write of id + server-generated password), Read (filters chc_-prefixed
  tags), Update (sparse PATCH; *[]Tag clears tags correctly; waits for
  size/ha_type transitions via WaitForPostgresLeaveAndReturn), Delete
  (thin wrapper — retry/poll lives in the client), ImportState by id.
- Validators: cloud_provider/postgres_version/ha_type/size OneOf, name
  length 1-50, chc_ tag prefix rejected at plan time.
- Constants snapshotted from cp-common protocol (82 VM_SPECS keys,
  PG_VERSIONS, POSTGRES_HA_TYPES, CLOUD_PROVIDERS).

Tests (helper-level, matching codebase convention)
- syncPostgresState round-trip with primary/HA/missing-IsPrimary cases
- apiTagsToSetValue chc_ filtering, empty/system-only/empty-value paths
- planTagsToAPI null/unknown/populated/null-value-attr paths
- buildPostgresUpdate diff matrix: no-op, size-only, ha-only, tags-only,
  tags-cleared (regression test for *[]Tag), combined
- notReservedTagPrefixValidator accept/reject/null/short-key
- isPostgresStateRunning forward-compat

Phase 2 deviations from the plan (documented in phase-two-implementation.md):
1. timeouts {} block deferred to Phase 5 (no existing resource uses it;
   restore/replica flows in Phase 5 will benefit from the same dependency).
   Phase 2 ships hardcoded 30m/30m/10m as package-level Go constants.
2. Lifecycle-level tests (Create/Update/Delete/Import) deferred to Phase 6
   integration tests (matches codebase convention — no resource_test.go in
   pkg/resource/ builds synthetic tfsdk.Plan/State).

Per the alpha-docs precedent (release.yaml:418-425 + cleanup-alpha-examples
job), docs/resources/postgres_service.md is generated by make docs-alpha
during release and is not committed to main. The embedded source lives at
pkg/resource/descriptions/postgres_service.md.

Refs: Phase 2 of clickhouse-postgres-service-tf-plan.md
Review-driven follow-up to 2af1b7c. Two real bugs plus dead code and
polish.

Bug fixes
- Reject empty-string tag values at plan time. The server normalizes
  "" to no-value, so apiTagsToSetValue would materialize that back as
  types.StringNull() — perpetual plan/state drift for any user who
  writes value = "". Add stringvalidator.LengthAtLeast(1) on the
  tags.value attribute and a TestTagValueValidator_RejectsEmptyString
  regression test. Description doc rewritten to direct users at null
  or omission instead.
- Drop booldefault.StaticBool(true) from is_primary. The Default
  field has no semantics on a Computed-only (non-Optional) attribute;
  syncPostgresState already supplies the fallback when the server
  omits the field.

Dead code removed
- int64validator.Between(1, 65535) on port (validators don't fire on
  Computed-only attributes).
- else-if final.Password != nil branch in Create — GetPostgres never
  echoes the password.
- plan.Password = state.Password in Update — UseStateForUnknown
  already carries the prior state value through to plan.

Polish
- strings.HasPrefix instead of hand-rolled slice compares in
  apiTagsToSetValue and notReservedTagPrefixValidator.
- Fix TestNotReservedTagPrefixValidator path construction: SetValue
  elements are Object values, not String values. Tests passed by
  accident because the validator doesn't inspect Path content.

Verification
- go build -tags alpha ./... clean
- go test -tags alpha ./... green (26 Postgres test cases, all pass)
- go test ./... (stable) green
- go vet -tags alpha ./... clean
- golangci-lint run clean
- make docs-alpha renders correctly
Bundles the open review concerns plus the data-loss bug caught live in
the Phase 2 e2e run against the dev cluster.

Bug: silent tag wipe on unrelated updates (e2e-caught, P0)
- Add setplanmodifier.UseStateForUnknown() to the tags SetNestedAttribute.
  Without it, the framework marks tags=(known after apply) on every plan,
  the resource layer receives Tags=Unknown in Update, planTagsToAPI
  returns nil, and buildPostgresUpdate treats nil as "clear all tags" —
  silently wiping any tags the user has set whenever they change any
  other attribute. Reproduced on the live resize test (r6gd.large →
  r6gd.xlarge); PATCH body included an unrequested "tags": [].
- Defense-in-depth: buildPostgresUpdate now explicitly skips the tags
  branch when plan.Tags.IsUnknown(). Even if a future regression drops
  the schema plan modifier, the diff logic refuses to clear tags while
  the plan is unresolved.
- Regression test: TestBuildPostgresUpdate/plan.Tags == Unknown is
  treated as 'leave server tags alone'.

Drift fix: explicit tags=[] is rejected at plan time
- Add setvalidator.SizeAtLeast(1). The server→state round-trip collapses
  empty arrays to SetNull (chc_-filtering produces an empty filtered
  list), so an explicit empty set in config would diff perpetually
  against null state. Users wanting no tags omit the attribute;
  UseStateForUnknown carries prior state forward.
- Tighten setvalidator.SizeAtMost to 50 (was 64) to match the server's
  MAX_TAGS_PER_RESOURCE constant.

Lifecycle test gap: extract buildPartialCreateState helper + unit test
- Move the mid-Create partial-state population from inline-in-Create
  into a standalone helper. Phase 2's "most novel piece" — the state
  write between CreatePostgres success and the post-wait re-read — is
  now testable without constructing synthetic tfsdk.Plan/State.
- TestBuildPartialCreateState covers: server-generated password
  branch, no-password branch (Phase 4 preview), user-set HaType /
  PostgresVersion / Tags preservation, and the critical "every other
  computed attr must be explicit Null (not Unknown)" contract.

Description: import-password gap, no-value PATCH constraint, no-empty-list
- Make the import gap impossible to miss: prominent warning callout
  with the connection_string workaround and Phase 4 plan.
- Document the server's PATCH-time "value is required" constraint
  (server-side 400 even though Create accepts no-value tags — Phase 2
  e2e finding).
- Document the new no-empty-list rule and explain why.

Polish (review nits)
- buildPostgresUpdate now returns a postgresUpdatePlan struct
  {Body, TransitionExpected} instead of three positional values. Read
  better at the call site. All tests updated.
- apiTagsToSetValue's internal-error message now names the concrete
  unexpected type instead of a generic "internal error".
- syncPostgresState empty-string-fallthrough behavior documented
  inline (deliberate: prefer prior state to silent overwrite).
- is_primary nil-defaulting-to-true behavior documented inline with
  a Phase 5 revisit pointer (mismarks a replica if server starts
  returning nil for replicas).

Verification
- go build -tags alpha ./... clean
- go test -tags alpha ./... green (37+ Postgres test cases pass)
- go test ./... (stable) green
- go vet -tags alpha ./... clean
- golangci-lint run ./... clean

E2E run captured in phase-two-e2e-notes.md (uncommitted, lives outside
the provider repo).
…round)

E2E v2 against the dev cluster (2026-05-28) surfaced a server-side PATCH
behavior that the original Phase 2 design did not account for:

  Sending PATCH /postgres/{id} with body {"size":"r6gd.xlarge"} (no
  tags field) returns 200 AND clears all server-side tags.

The provider's request body was correct — just {"size":...} — but the
server's downstream Ubicloud call treats a missing tags field as
clear-all rather than no-change. Confirmed against the dev cluster:
instance had two tags before the resize PATCH, server-side tags = []
immediately after, despite no tag-clear request from the provider.

This is asymmetric with size / ha_type, which the server DOES preserve
when omitted. The asymmetry isn't documented in the OpenAPI handler
validator. Worth a server-side ticket (FIXME placed inline); meanwhile
the provider works around it.

Fix
- buildPostgresUpdate now always re-asserts the current state's tags
  in the PATCH body whenever it's also mutating size or ha_type. If
  state has no tags, the field stays nil (safe — server has no tags
  to clear).
- diffTags helper extracted to make the three-way intent (no-change,
  clear, replace) explicit. Plan.Tags == Unknown still defends against
  a regression in UseStateForUnknown — when Unknown meets non-empty
  state, the workaround forces the state tags into the body.

Tests
- Existing TestBuildPostgresUpdate/plan.Tags == Unknown test was
  asserting the OLD (buggy) behavior; updated to assert the new
  server-clear-defense semantics.
- Two new regression subtests:
  - size-only diff with non-empty state tags → tags re-asserted
  - size-only diff with no state tags → tags stays nil

Description
- New "Server-side PUT-like tag semantics" callout in differences-from-
  clickhouse_service section. Users will see tags repeated in
  TF_LOG=DEBUG bodies on non-tag mutations; harmless and explains why.

Verification
- go build -tags alpha ./... clean
- go test -tags alpha ./... green (3 new test subtests pass; all
  previous tests still green)
- go test ./... (stable) green
- golangci-lint run ./... clean
Eight review concerns landed in one commit. Three were genuine blockers
that would have surfaced as real user pain post-merge; the rest improve
correctness / observability for the alpha.

High-severity (blockers)
- #1 state attribute UseStateForUnknown: without USFU the framework
  marked state as (known after apply) on every plan, forcing Update
  even on no-op applies — and the no-op branch wrote Unknown straight
  to final state, which the framework rejects with "Provider produced
  inconsistent result after apply." Every other persistent computed
  attribute already had USFU; this one was missed. Fix is the same
  one-liner used at clickpipe.go:179-185.
- #2 ha_type silent downgrade: Default("none") fires not just on
  Create but also when a user deletes the ha_type line on an existing
  resource — silently downgrading async/sync → none. A naive
  UseStateForUnknown won't help (Default beats USFU in plan), so the
  fix is to drop the Default entirely and add UseStateForUnknown.
  The server still defaults to "none" on Create when omitted, so
  fresh resources behave identically. An explicit ha_type = "none"
  still downgrades.
- #3 no-value tags break on PATCH: schema invited null values, but
  server returns 400 BAD_REQUEST on PATCH if any tag entry omits
  value (verified live in Phase 2 e2e v1 step 2). Create accepted
  them; an Update with a no-value tag in state would silently fail.
  Schema's value attribute is now Required to keep Create and Update
  behavior symmetric. Description updated.

Medium-severity
- #4 tag value mutation test: existing matrix never exercised
  changing the value on an existing key. Added the test plus a JSON
  wire-shape assertion for the tag-clear path ("tags": [] must reach
  the wire, not be omitted via marshalling).
- #5 size OneOf dropped: the 82-entry compile-time snapshot meant
  every new AWS instance family needed a provider patch release
  before users could adopt it. For the most-frequently-changed
  attribute the friction outweighed plan-time validation. Matches
  region's "server validates" pattern. The plan itself flagged this
  trade-off as worth dropping; the review made the case now to act.
- #6 context cancellation in waits: the wait helpers' outer
  backoff.Retry and doRequest's inner backoff both ignored ctx
  cancellation, so Ctrl-C / Terraform deadlines could spin up to
  the full retry budget. Wrapped both with backoff.WithContext.
  Regression test in pkg/internal/api/postgres_test.go asserts
  cancellation aborts within ~5s on a 5s-interval / 100-retry
  budget (would otherwise be 500s). Touches a Phase 1 file but
  the fix is strictly additive — all existing tests still pass.
- #7 log verbatim 409 body on dependent-replica fail-fast:
  errIndicatesDependentReplica is a speculative keyword heuristic
  (FIXME for Phase 6). Added a tflog.Warn capturing the verbatim
  error so Phase 6's dependent-replica integration test has real
  data to anchor on (or replace with a stable error code).

Nit
- Delete missing return after AddError: consistency with
  Create/Read/Update; framework already preserves state when
  diagnostics contain an error, but the missing return is a
  future-fall-through trap.

Documentation
- descriptions/postgres_service.md updated to reflect the schema
  changes: ha_type omission preserves prior state, value is now
  required on tags, size has no client-side enum, known-limitation
  text rewritten.

Verification
- go build -tags alpha ./... clean; go build ./... clean
- go test -tags alpha ./... green
- go test ./... (stable) green
- golangci-lint run ./... clean
The errIndicatesDependentReplica heuristic shipped without ever observing
the real server response. The Phase 2 resource doesn't support read
replicas (Phase 5 ships read_replica_of), so we couldn't trigger the
scenario from Terraform. The heuristic was guessing on three counts:

1. That the server returns HTTP 409 (might be 422, 400, or 500).
2. That the body contains the words "depend" AND "replica" (might be
   "child resource", "blocked by downstream", or structured errorCode).
3. That the scenario even produces an error at all (Ubicloud might
   cascade-delete or orphan the replica instead).

Three rounds of e2e tests against the dev cluster never exercised this
code path. The unit test asserted behavior against a synthetic 409 with
hand-written text matching the heuristic — proving only the matcher
works against its own anchor, not that the anchor reflects reality.

Removed:
- errIndicatesDependentReplica function
- The if-branch in deletePostgresWithBudget that called it
- tflog.Warn line in the same branch
- TestDeletePostgres_FailsFastOnDependentReplica (synthetic match)
- TestDeletePostgres_RetriesOn409WithoutDependentSignal (negative test)
- tflog + strings imports (now unused)
- Comment reference in pkg/resource/postgres_service.go Delete

Net behavior: 409s now uniformly retry for ~15 minutes. If Ubicloud
does refuse dependent-replica deletes with a 409, the user waits 15
min and then sees the error — annoying but bounded, and rare because
Terraform's dependency graph normally destroys replicas before primaries.

Reintroduction plan: Phase 5 ships read_replica_of, making the
primary+replica pair createable from Terraform. The first Phase 6
integration test that attempts a primary delete with a replica still
attached captures the real response. If it's a non-retryable 4xx with
a stable distinguishing field (errorCode, reason, structured JSON),
reintroduce the fail-fast anchored on that field — not a text match.

Verification
- go build -tags alpha ./... clean
- go test -tags alpha ./... green
- go test ./... (stable) green
- golangci-lint run ./... clean
@amogiska amogiska force-pushed the amogiska/managed-postgres-resource branch from a11d326 to be6b706 Compare June 1, 2026 21:12
amogiska added 7 commits June 1, 2026 14:15
The Phase 2 branch had picked up three divergences in pkg/internal/api/
during the PR review iterations:

1. common.go: backoff.WithContext wrap on doRequest's inner retry loop
2. postgres.go: backoff.WithContext on the wait helpers + DeletePostgres;
   halfBudget < 1 guard on the leave-and-return helper; removal of the
   errIndicatesDependentReplica heuristic
3. postgres_test.go: removed the dependent-replica test pair; added a
   TestWaitForPostgresState_HonorsContextCancellation that depended on
   the backoff.WithContext change

Phase 1 (origin/amogiska/managed-postgres-api-client) is the source of
truth for the api/ layer. Phase 2 has no business shipping changes there.

Resolution: `git checkout origin/amogiska/managed-postgres-api-client --
pkg/internal/api/`. Working tree now matches Phase 1 byte-for-byte for
the three files. The corresponding Phase 2 review/e2e findings still
matter — they belong on the Phase 1 branch (or a future patch on top of
the released alpha) rather than on Phase 2.

Resource layer (pkg/resource/...) and descriptions still call into the
Phase 1 client API (WaitForPostgresStateTransitionAndReturn, etc.) which
already had the renames. No resource-layer changes needed; tests still
pass.

Verification
- go build -tags alpha ./... clean
- go test -tags alpha ./... green (whole module)
- go test ./... (stable) green
- golangci-lint run ./... clean
Phase 1 review feedback (theme 10: "don't reference private context in
code") flagged FIXME(phase-X) tags as opaque to reviewers. Phase 2
shipped with 25+ similar references — "Phase 0/1/2/3/4/5", "plan line
158", "e2e v1/v2", with two on user-facing schema descriptions that
surface in the registry doc page.

Strip all of them from:
- pkg/resource/postgres_service.go — schema descriptions for is_primary,
  password, size; doc comments on Create / Update / Delete /
  buildPartialCreateState / buildPostgresUpdate; inline comments on
  ha_type default omission and Tags handling
- pkg/resource/postgres_service_test.go — comments on TestBuildPartial-
  CreateState, the "tags cleared" subtest, the Unknown-Tags subtest,
  and the size-only server-clear-defense subtest
- pkg/resource/postgres_service_constants.go — the snapshot-source
  comment and the dead "postgresSizes was removed" tombstone block,
  plus the timeouts block dev-observation note
- pkg/resource/models/postgres_service_resource.go — type-level doc
  comment listing the Phase-by-Phase additions

Additional cleanup (Phase 1 themes 5 and 11):

- Inline stringvalidatorLengthAtLeast1: one-line wrapper around
  stringvalidator.LengthAtLeast(1) that didn't actually decouple tests
  from the schema. Call the upstream validator directly at the single
  call site.
- Trim the Create doc comment from a 6-step play-by-play (which
  restated the function body) to two sentences naming the only non-
  obvious bit: the mid-Create partial state write for crash recovery.

Verification
- go build -tags alpha ./... clean
- go build ./... (stable) clean
- go test -tags alpha ./... green
- go test ./... (stable) green
- golangci-lint run ./... clean
The embedded description rendered into the registry doc page had 16
references to "Phase 2/3/4/5/18" — internal milestone labels that a
docs reader can't see. Strip all of them and reframe sections as
"what the resource is" rather than "where it is in the rollout."

Also drop the "Differences from clickhouse_service" section entirely.
A resource page shouldn't be a comparison sheet against a sibling
resource: a reader looking at clickhouse_postgres_service wants
Postgres docs, not a delta against an OLAP DB. The substantive bullets
were redistributed:

- Tag shape / required value / chc_ prefix / no empty list / PUT-like
  PATCH semantics → new "Tag semantics" section.
- name immutability → "Known limitations" as a single bullet.
- "No IP allowlist, etc." comparison bullet → folded into "Current
  scope" as one of the not-yet-supported items.

Section header "Phase 2 scope" → "Current scope". Same content, no
calendar dependency.
Matches the convention used by clickhouse_service, the established
sibling resource in this provider. The user-facing UX is now consistent:

  tags = {
    team = "billing"
    env  = "dev"
  }

across both clickhouse_postgres_service and clickhouse_service.

Justification for the original nested-object schema was that the server's
ResourceTagV1.value is optional, and a flat map can't represent null
values. That argument went obsolete when PR #553 review made tag values
Required at the schema layer to mirror the server's PATCH-time 400 on
omitted values. With values guaranteed non-empty, the nested-object
shape carries no information a map can't.

What changed (resource layer only):
- pkg/resource/postgres_service.go: schema attribute is now MapAttribute
  with ElementType = StringType, mapvalidator.{SizeAtLeast, SizeAtMost,
  KeysAre, ValueStringsAre}, mapplanmodifier.UseStateForUnknown
- Helper functions: apiTagsToSetValue → apiTagsToMapValue (returns
  types.Map, drops the chc_ filter + empty-value drop into the
  conversion loop). planTagsToAPI takes types.Map and uses
  ElementsAs[map[string]string].
- buildPartialCreateState uses types.MapNull(types.StringType) for the
  unknown-tags pin.
- The diffTags / buildPostgresUpdate plumbing (including the
  server-clear defense from e2e v2) is untouched conceptually — it
  still produces *[]api.Tag, just from a different source shape.
- pkg/resource/models/postgres_service_resource.go: Tags is now
  types.Map. Dropped PostgresServiceTagModel + PostgresServiceTagObjectType
  (no longer needed).
- Description doc updated to match the new HCL shape.

What did NOT change:
- pkg/internal/api/* — Phase 1 source-of-truth boundary respected.
  The wire shape is still []api.Tag (server contract); only the
  resource-layer translation changed.
- The chc_-prefix server-side reservation, the server-clear-defense
  (PUT-like PATCH semantics), and the *[]Tag clear-vs-omit pointer
  intent in api.PostgresUpdate.Tags all still apply.

Tests rewritten end-to-end for the map shape; all 24 unit-test
subtests pass. Net -153 lines.

Verification
- go build -tags alpha ./... clean
- go build ./... (stable) clean
- go test -tags alpha ./... green
- go test ./... (stable) green
- golangci-lint run ./... clean
1. Empty-string guard for state and created_at.
   postgres_models.go marks both fields json:"...,omitempty"; a server
   GET mid-transition can omit them, in which case the previous code
   silently overwrote tracked values with "". Empty state confuses
   debuggers; empty created_at breaks downstream formatdate / timeadd
   in user configs.

   Applies the same `if pg.X != ""` guard already used by size and
   postgres_version above.

2. Pin Password contract in modelsEqual + dedicated regression test.
   syncPostgresState intentionally never touches state.Password (the
   server doesn't echo it on GET, so the framework value written at
   Create time is the only source of truth). The previous modelsEqual
   helper silently skipped Password, so any future regression that
   clobbered Password would have been invisible. Added Password to the
   pairs slice and a TestPostgresResource_syncPostgresState_preservesPassword
   case that pre-populates Password and asserts it survives sync.

   Existing table tests that didn't set want.Password rely on the zero
   value matching got's untouched zero value — no fixture updates
   needed for those.

3. Adjusted two table cases that previously asserted
   State/CreatedAt == StringValue("") to reflect the new contract:
   when the server omits the field, the resource model leaves the
   slot at its zero (unset) value rather than writing "".

Verification
- go build -tags alpha ./... clean
- go build ./... (stable) clean
- go test -tags alpha ./... green (incl. new preservesPassword case)
- go test ./... (stable) green
- golangci-lint run ./... clean
Rewrites the registry description so it reads as documentation for the
resource as it exists, not as a draft for what the resource might become.

- "Current scope" → "Supported lifecycle" + "Unsupported attributes"
- "This release ships..." dropped
- "not yet supported" → "intentionally absent"
- "Today the server generates the password" → "The server generates the password"
- "currently hardcoded" → "hardcoded"
- "this release does not ship" / "any future apply" rephrased to describe
  current behavior without implying a later release will change it
- "Known limitations (alpha)" → "Known limitations" (the alpha callout
  at the top of the page is sufficient context for that header)
- "not user-configurable in this release" → "not user-configurable"

No behavioral or schema changes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant